Assembling proteomics data as a prerequisite for the analysis of large scale experiments

Authors

  • Frank Schmidt
  • Monika Schmid
  • Bernd Thiede
  • Klaus-Peter Pleißner
  • Martina Böhme
  • Peter R Jungblut
Abstract

BACKGROUND: Despite the complete determination of the genome sequences of a huge number of bacteria, their proteomes remain relatively poorly defined. Besides new methods to increase the number of identified proteins, new database applications are necessary to store and present the results of large-scale proteomics experiments.

RESULTS: In the present study, a database concept was developed to address these issues and to offer complete information via a web interface. In our concept, the Oracle-based data repository system SQL-LIMS plays the central role in the proteomics workflow; it was applied to the proteomes of Mycobacterium tuberculosis, Helicobacter pylori and Salmonella typhimurium and to protein complexes such as the 20S proteasome. The technical operations of our proteomics labs were used as the standard for SQL-LIMS template creation. By means of a Java-based data parser, post-processed data from different approaches, such as LC/ESI-MS, MALDI-MS and 2-D gel electrophoresis (2-DE), were stored in SQL-LIMS. A minimum set of the proteomics data was transferred to our public 2D-PAGE database using a Java-based interface (Data Transfer Tool, DTT) in accordance with the requirements of the PEDRo standardization. Furthermore, the stored proteomics data could be extracted from SQL-LIMS via XML.

CONCLUSION: The Oracle-based data repository system SQL-LIMS played the central role in the proteomics workflow concept. The technical operations of our proteomics labs were used as standards for SQL-LIMS templates. Using a Java-based parser, post-processed data from different approaches such as LC/ESI-MS, MALDI-MS, 1-DE and 2-DE were stored in SQL-LIMS. Thus, the unique data formats of different instruments were unified and stored in SQL-LIMS tables. Moreover, a unique submission identifier allowed fast access to all experimental data. This was the main advantage compared to multi-software solutions, especially when staff turnover is high. In addition, large-scale and high-throughput experiments must be managed in a comprehensive repository system such as SQL-LIMS so that results can be queried in a systematic manner. On the other hand, these database systems are expensive and require at least one full-time administrator and a specialized lab manager. The rapid technical development in proteomics may also make it difficult to accommodate new data formats. To summarize, SQL-LIMS met the requirements of proteomics data handling, especially in skilled processes such as gel electrophoresis or mass spectrometry, and fulfilled the PSI standardization criteria. The data transfer into the public domain via DTT facilitated validation of proteomics data. Additionally, evaluation of mass spectra by post-processing with MS-Screener improved the reliability of mass analysis and prevented the storage of junk data.
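The idea of unifying instrument-specific formats under one submission identifier can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes a Java-based parser and the Oracle-based SQL-LIMS, whereas this example uses Python with an in-memory SQLite table as a stand-in. All field names, the table layout, the sample identifiers and the helper functions are hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class UnifiedRecord:
    """One row of the unified table (hypothetical layout)."""
    submission_id: str  # key shared by all data of one submission
    technique: str      # e.g. "MALDI-MS" or "LC/ESI-MS"
    protein: str        # protein identifier reported by the instrument software
    score: float        # identification score (assumed field)


# Each instrument export is assumed to use its own field names;
# one small parser per technique maps them onto UnifiedRecord.
def parse_maldi(row, submission_id):
    return UnifiedRecord(submission_id, "MALDI-MS", row["acc"], float(row["score"]))


def parse_lc_esi(row, submission_id):
    return UnifiedRecord(
        submission_id, "LC/ESI-MS", row["protein_id"], float(row["match_score"])
    )


def store(conn, records):
    """Write unified records into a single results table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results "
        "(submission_id TEXT, technique TEXT, protein TEXT, score REAL)"
    )
    conn.executemany(
        "INSERT INTO results VALUES (?, ?, ?, ?)",
        [(r.submission_id, r.technique, r.protein, r.score) for r in records],
    )
    conn.commit()


def fetch_by_submission(conn, submission_id):
    """Return all stored results for one submission, across techniques."""
    cur = conn.execute(
        "SELECT technique, protein, score FROM results "
        "WHERE submission_id = ? ORDER BY technique, protein",
        (submission_id,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store(conn, [
        parse_maldi({"acc": "PROT_A", "score": "87.5"}, "SUB-001"),
        parse_lc_esi({"protein_id": "PROT_B", "match_score": "3.2"}, "SUB-001"),
    ])
    for row in fetch_by_submission(conn, "SUB-001"):
        print(row)
```

In this sketch a single lookup by submission identifier returns the results of every technique, which mirrors the advantage the abstract attributes to a central repository over several separate software tools.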

Similar resources

Large Scale Experiments Data Analysis for Estimation of Hydrodynamic Force Coefficients Part 1: Time Domain Analysis

This paper describes various time-domain methods useful for analyzing experimental data on the force experienced by a circular cylinder in waves and current, in order to estimate the drag and inertia coefficients applicable to Morison's equation. An additional approach, the weighted least squares method, is also introduced. A set of data obtained from experiments on heavily roughened circul...

Large Scale Experiments Data Analysis for Estimation of Hydrodynamic Force Coefficients

This paper describes the various frequency-domain methods that may be used to analyze experimental data on the force experienced by a circular cylinder in waves and current, in order to estimate the drag and inertia coefficients for use in Morison's equation. An additional approach, system identification techniques (SIT), is also introduced. A set of data obtained from experiments on heavily roughened circular...

Proteomics Applications in Health: Biomarker and Drug Discovery and Food Industry

Advances in genome sequencing have greatly propelled the understanding of the living world; however, they are insufficient for a full description of a biological system. Proteomics has emerged as another large-scale platform for improving the understanding of biology. Proteomic experiments can be used for different aspects of clinical and health sciences such as food technology, biomark...

Gene assembling: a new approach in molecular diagnosis of hereditary breast cancer

Background: Many disease susceptibility genes are large and consist of many exons in which point mutations are scattered throughout. Scanning each exon individually is a tedious task that can be time-consuming and expensive. There has been increasing demand for rapid and accurate methods for full scanning of unknown point mutations in large multi-exon genes. Gene Assembling i...

Journal:
  • Chemistry Central Journal

Volume 3  Issue 

Pages  -

Publication date 2009